Support simple-stories lms #3

danbraunai-apollo · 2025-04-22T05:58:59Z

Description

Allows for running APD on simple-stories models.
Only includes the loss functions necessary for our initial exploration, and not everything that is supported for tms and resid_mlp.
Adds a streamlit dashboard at spd/experiments/lm/app.py to explore the components for a trained model.

How Has This Been Tested?

None. Notably, there are no tests yet for whether the tokens accurately match the text.
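A check along these lines could cover the missing token/text test. This is only a minimal sketch: the `Tokenizer` interface, `ToyWhitespaceTokenizer`, and `find_round_trip_mismatches` are hypothetical names made up for illustration, and the toy tokenizer is a stand-in, not the tokenizer used by the simple-stories models.

```python
from typing import Protocol


class Tokenizer(Protocol):
    """Hypothetical minimal interface; the real tokenizer's API may differ."""

    def encode(self, text: str) -> list[int]: ...

    def decode(self, ids: list[int]) -> str: ...


class ToyWhitespaceTokenizer:
    """Stand-in tokenizer for illustration only (not the simple-stories one)."""

    def __init__(self) -> None:
        self.vocab: dict[str, int] = {}  # word -> id
        self.inverse: list[str] = []     # id -> word

    def encode(self, text: str) -> list[int]:
        ids = []
        for word in text.split(" "):
            # Grow the vocabulary on the fly so any input can be encoded.
            if word not in self.vocab:
                self.vocab[word] = len(self.inverse)
                self.inverse.append(word)
            ids.append(self.vocab[word])
        return ids

    def decode(self, ids: list[int]) -> str:
        return " ".join(self.inverse[i] for i in ids)


def find_round_trip_mismatches(tokenizer: Tokenizer, texts: list[str]) -> list[str]:
    """Return the texts for which decode(encode(text)) differs from text."""
    return [t for t in texts if tokenizer.decode(tokenizer.encode(t)) != t]
```

If the real tokenizer normalises its input (for example by lowercasing), an exact round trip will not hold, and the comparison would need to be made against the normalised text instead.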

Does this PR introduce a breaking change?

No

danbraunai-apollo added 30 commits February 12, 2025 12:45

Rename some topk_mask vars to mask

b53df86

Implement gating (untested)

0f4e7f8

Fix grad attributions and calc_recon_mse

c784489

Init gate with bias=1 and weights normal dist mean=0 std=0.2

e3c3eb0

Fix lp sparsity loss

15b310c

Add random mask loss

3aff69b

Use relud masks for lp sparsity loss

13b8097

Use masked_target_component_acts in calc_act_recon_mse

0923c0f

Comment out grad attribution calculation so people don't use now

3aceb8a

Store gates in model class

61247dc

Remove buggy tms deprecated params replacement

64c3a23

Tie the gates for TMS

ed32237

Plot masks

60cc056

Fix resid_mlp test (sensitive to float precision)

bc9505c

Add init_from_target for tms

01a03bc

Support init_from_target for resid_mlp

6d6d99f

Normalise lp sparsity by batch size

c303c14

Don't copy biases in init_spd_model_from_target_model

41bd85b

Fix resid_mlp init_from_target test

befac1d

Add randrecon to run label

e7e60a7

Permute to identity for plotting mask_vals

3845ca3

Remove post_relu_act_recon config arg

3bb654c

Remove code from global scope in plotting

ebee911

Handle deprecated 'post_relu_act_recon' arg.

0b3f61d

Use mps if available

931b6f3

Avoid mps as it breaks tms

19d7181

Untie gates in TMS

8560f1b

Allow for detached inputs to gates and use target_out in random_mask_recon

79391e9

Add GateMLP

cd23609

Remove bias_val and train_bias config args

96939c2

danbraunai-apollo and others added 29 commits March 17, 2025 16:00

Load env vars when running sweeps too

5afdc92

Add layerwise recon (#263)

e80f874

* Add layerwise recon
* Add layerwise_random_recon_loss
* Protect the eyes of mathematicians

Remove transformer-lens dependency

16992e5

Use new random masks for layerwise_random_masks

7f6a94b

Add jaxtyping to dependencies

5c632f9

Add einops dependency

5981df6

Use calc_recon_mse in calc_random_masks_mse_loss for consistency

fcff304

Set bias to zero in GateMLP mlp_out

7ac2a42

WIP: Swap components with Llama nn.Linear modules

037caf1

Fix nn.Linear shape and handle masked components

1a1dcaf

WIP: Add lm_decomposition script

993da44

Fix module paths

fccc189

WIP: Add param_match_loss

3fcf593

Add layerwise recon losses

aa7cacf

Add lp sparsity loss

82b505a

Minor comment and config clean

96ae954

Make components a submodule of SSModel and update model loading

cb12ed1

Add SSModel.from_pretrained()

d3a7c76

WIP: Fix download with weights_only=True

1425354

Merge branch 'main' into feature/lm

8ba8ca9

Calc mask l0 for lms

7a23520

Merge branch 'main' into feature/lm

2706112

Fix missing GateMLP type references

0103c0c

Merge branch 'feature/lm' into feature/lm-temp

bcd3e09

Update component_viz for new model format

60fa3cc

Plot mean components during apd run

04bcbe1

Re-organise wandb logging

c2bdda1

Add streamlit dashboard for lm (#2)

072085e

* WIP: Add dashboard
* Create base_cache_dir if it doesn't exist
* Functional dashboard
* Add simple-stories-train and datasets to pyproject.toml

Remove unused set_nested_module_attr function

04a2138

danbraunai-apollo merged commit f47c5b4 into dev Apr 22, 2025
1 check passed
